8 research outputs found

    SCALABLE INTEGRATED CIRCUIT SIMULATION ALGORITHMS FOR ENERGY-EFFICIENT TERAFLOP HETEROGENEOUS PARALLEL COMPUTING PLATFORMS

    Integrated circuit technology has gone through several decades of aggressive scaling. It is increasingly challenging to analyze growing design complexity. Post-layout SPICE simulation can be computationally prohibitive due to the huge number of parasitic elements, which can easily inflate the computation and memory cost. As device sizes decrease, circuits become more vulnerable to process variations. Designers need to statistically estimate the probability that a circuit does not meet its performance metric, which requires millions of simulations to capture rare failure events. Recently, multiprocessors with heterogeneous architectures have emerged as mainstream computing platforms. A heterogeneous computing platform can achieve high-throughput, energy-efficient computing. However, using such a platform is not trivial, and existing algorithms need to be reinvented to fully utilize its computing resources. This dissertation presents several new algorithms to address these two significant and challenging issues on heterogeneous platforms. Harmonic Balance (HB) analysis is essential for efficient verification of large post-layout RF and microwave integrated circuits (ICs). However, existing methods either suffer from excessively long simulation time and prohibitively large memory consumption or exhibit poor stability. This dissertation introduces a novel transient-simulation guided graph sparsification technique, as well as an efficient runtime performance modeling approach tailored for heterogeneous manycore CPU-GPU computing systems, to build nearly-optimal subgraph preconditioners that can lead to minimum HB simulation runtime. Additionally, we propose a novel heterogeneous parallel sparse block matrix algorithm that takes advantage of the structure of HB Jacobian matrices as well as the GPU's streaming multiprocessors to achieve optimal workload balancing during the preconditioning phase of HB analysis. We also show how the proposed preconditioned iterative algorithm can efficiently adapt to heterogeneous computing systems with different CPU and GPU computing capabilities. Extensive experimental results show that our HB solver can achieve up to 20X speedups and 5X memory reduction when compared with the state-of-the-art direct solver highly optimized for twelve-core CPUs. In today's variation-aware IC designs, cell characterization and SRAM yield analysis require many thousands or even millions of repeated SPICE simulations of relatively small nonlinear circuits. In this dissertation, for the first time, we present a massively parallel SPICE simulator on GPU, TinySPICE, for efficiently analyzing small nonlinear circuits. TinySPICE integrates a highly optimized shared-memory based matrix solver and a fast device evaluation method based on parametric three-dimensional (3D) lookup tables (LUTs). A novel circuit clustering method is also proposed to improve the stability and efficiency of the matrix solver. Compared with a CPU-based SPICE simulator, TinySPICE achieves up to 264X speedups for parametric SRAM yield analysis without loss of accuracy.
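The abstract's point that rare failure events require millions of simulations can be illustrated with a minimal Monte Carlo sketch. Everything here is an assumption made for illustration: `simulate_read_margin` is a hypothetical stand-in for one SPICE run (a Gaussian noise margin with made-up nominal and sigma values), not TinySPICE or any real simulator, and the function names are not taken from the dissertation.

```python
import math
import random


def simulate_read_margin(rng, nominal=0.20, sigma=0.05):
    """Hypothetical stand-in for one SPICE run: returns a noise margin in volts.

    Assumption: the margin is Gaussian around a nominal value. A real flow
    would run a circuit simulation with sampled device parameters instead.
    """
    return rng.gauss(nominal, sigma)


def estimate_failure_rate(num_samples, spec=0.10, seed=0):
    """Plain Monte Carlo: fraction of samples whose margin falls below spec."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(num_samples) if simulate_read_margin(rng) < spec
    )
    p = failures / num_samples
    # Standard error of a binomial proportion estimate.
    std_err = math.sqrt(p * (1.0 - p) / num_samples)
    return p, std_err


def samples_for_relative_error(p_fail, rel_err):
    """Samples needed so that std_err / p_fail is about rel_err."""
    return math.ceil((1.0 - p_fail) / (p_fail * rel_err ** 2))


if __name__ == "__main__":
    p, se = estimate_failure_rate(200_000)
    print(f"estimated failure rate: {p:.5f} +/- {se:.5f}")
    # A one-in-a-million failure needs on the order of 1e8 plain samples
    # to be estimated to within 10% relative standard error.
    print(samples_for_relative_error(1e-6, 0.1))
```

With the assumed values the spec sits two standard deviations below nominal, so the true failure rate is about 2.3%. The sample-count formula follows from the binomial standard error and shows why plain Monte Carlo becomes expensive as the target failure probability shrinks.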

    TinySPICE plus: Scaling up statistical SPICE simulations on GPU leveraging shared-memory based sparse matrix solution techniques

    © 2016 ACM. TinySPICE was a SPICE simulator on GPU developed to achieve dramatic speedups in statistical simulations of small nonlinear circuits, such as standard cell designs and SRAMs. While TinySPICE can perform circuit simulations much faster than traditional SPICE tools for small circuits, it may not be efficient for relatively large logic/memory circuit designs, because its embedded dense MNA matrix solver can lead to fast-growing memory cost as the matrix size increases. In this work, we present TinySPICE Plus, a full-blown statistical SPICE simulation engine on a GPU platform that integrates a highly optimized shared-memory based sparse matrix solver capable of dealing with much larger circuits than TinySPICE while achieving orders of magnitude speedup over traditional CPU-based SPICE simulation engines. Extensive experimental results show that TinySPICE Plus can achieve over 70X speedups for parametric yield analysis of SRAM arrays and variation-aware logic circuit characterizations.

    Transient-simulation guided graph sparsification approach to scalable harmonic balance (HB) analysis of post-layout RF circuits leveraging heterogeneous CPU-GPU computing systems

    Harmonic Balance (HB) analysis is key to efficient verification of large post-layout RF and microwave integrated circuits (ICs). This paper introduces a novel transient-simulation guided graph sparsification technique, as well as an efficient runtime performance modeling approach tailored for heterogeneous manycore CPU-GPU computing systems, to build nearly-optimal subgraph preconditioners that can lead to minimum HB simulation runtime. Additionally, we propose a novel heterogeneous parallel sparse block matrix algorithm that takes advantage of the structure of HB Jacobian matrices as well as the GPU's streaming multiprocessors to achieve optimal workload balancing during the preconditioning phase of HB analysis. We also show how the proposed preconditioned iterative algorithm can efficiently adapt to heterogeneous computing systems with different CPU and GPU computing capabilities. Extensive experimental results show that our HB solver can achieve up to 20X speedups and 5X memory reduction when compared with the state-of-the-art direct solver highly optimized for eight-core CPUs.

    An adaptive graph sparsification approach to scalable harmonic balance analysis of strongly nonlinear post-layout RF circuits

    In the past decades, harmonic balance (HB) has been widely used for computing steady-state solutions of nonlinear radio-frequency (RF) and microwave circuits. However, using HB to simulate strongly nonlinear post-layout RF circuits remains a very challenging task. Although direct solution methods can be adopted to handle moderate to strong nonlinearities in HB analysis, such methods do not scale efficiently to large problems due to excessively long simulation time and prohibitively large memory consumption. In this paper, we present a novel graph sparsification approach for automatically generating preconditioners that can be efficiently applied to simulating strongly nonlinear post-layout RF circuits. Our approach sparsifies the time-domain modified nodal analysis (MNA) matrices of the circuit, which can subsequently be leveraged for sparsifying the entire HB Jacobian matrix. We show that the resultant sparsified Jacobian matrix can be used as a robust yet efficient preconditioner in HB analysis. Our experimental results show that, compared with the prior state-of-the-art direct solution method, the proposed solver can more efficiently handle moderate to strong nonlinearities during the HB analysis of RF circuits, achieving up to 20× speedups and 6× memory reductions.

    Graph sparsification approaches to scalable integrated circuit modeling and simulations

    © 2014 IEEE. Unlike traditional fast SPICE simulation techniques that rely on a variety of approximation approaches to trade off simulation accuracy for greater speed, SPICE-accurate integrated circuit (IC) simulations can faithfully predict circuit electrical behaviors, and have therefore become indispensable for verification of large IC designs. Post-layout SPICE-accurate simulation should be able to encapsulate multi-million or even multi-billion devices that are coupled through complex parasitics, and it has become an essential procedure for verification of today's nanoscale IC designs. Although many efficient numerical methods have been developed and adopted in state-of-the-art SPICE-accurate circuit simulators for solving the large sparse matrices involved in IC simulations, existing simulators may not be capable of handling extremely large-scale post-layout ICs, in that the computation and memory cost can increase exponentially with circuit size and the number of parasitic components. This paper introduces our recent effort in developing 'truly scalable' SPICE-accurate nonlinear circuit simulation methods that can scale comfortably with extremely large-scale post-layout IC designs without sacrificing accuracy.

    A performance-guided graph sparsification approach to scalable and robust SPICE-accurate integrated circuit simulations

    To improve the efficiency of direct solution methods in SPICE-accurate integrated circuit (IC) simulations, preconditioned iterative solution techniques have been widely studied in the past decades. However, it is still an extremely challenging task to develop robust yet efficient general-purpose preconditioning methods that can deal with various types of large-scale IC problems. In this paper, based on recent graph sparsification research, we propose circuit-oriented general-purpose support-circuit preconditioning (GPSCP) methods to dramatically improve the sparse matrix solution time and reduce the memory cost during SPICE-accurate IC simulations. By sparsifying the Laplacian matrix extracted from the original circuit network using graph sparsification techniques, general-purpose support circuits can be efficiently leveraged as preconditioners for solving large Jacobian matrices through Krylov-subspace iterations. Additionally, a performance model-guided graph sparsification framework is proposed to help automatically build nearly-optimal GPSCP solvers. Our experimental results for a variety of large-scale IC designs show that the proposed preconditioning techniques can achieve up to 18× runtime speedups and 7× memory reduction in DC and transient simulations when compared to state-of-the-art direct solution methods.
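The idea of using a sparsified circuit graph as a preconditioner for Krylov-subspace iterations can be sketched on a toy resistive grid. This is only an illustration under stated assumptions, not the GPSCP method: the abstract does not specify the sparsifier, so a maximum-weight spanning tree is swapped in as a simple stand-in, the matrices are dense and factored with a textbook Cholesky for brevity, and all function names and parameter values are hypothetical.

```python
import random


def max_spanning_tree(n, edges):
    """Kruskal with union-find: keep the heaviest edges that close no cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


def conductance_matrix(n, edges, ground):
    """Dense nodal matrix: graph Laplacian plus a grounded conductance per node."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = ground
    for u, v, w in edges:
        A[u][u] += w
        A[v][v] += w
        A[u][v] -= w
        A[v][u] -= w
    return A


def cholesky(A):
    """Lower-triangular factor L with A = L L^T (A must be SPD)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s ** 0.5 if i == j else s / L[j][j]
    return L


def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    return [dot(row, x) for row in A]


def pcg(A, b, apply_Minv, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradient; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = b[:]
    z = apply_Minv(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = apply_Minv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def demo(rows=6, cols=6, ground=0.1, seed=1):
    """Solve a random resistive grid with and without the tree preconditioner."""
    rng = random.Random(seed)
    n = rows * cols
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1, rng.uniform(0.1, 10.0)))
            if r + 1 < rows:
                edges.append((i, i + cols, rng.uniform(0.1, 10.0)))
    tree = max_spanning_tree(n, edges)
    A = conductance_matrix(n, edges, ground)
    M_factor = cholesky(conductance_matrix(n, tree, ground))
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    _, it_plain = pcg(A, b, lambda res: res[:])
    x_tree, it_tree = pcg(A, b, lambda res: chol_solve(M_factor, res))
    return A, b, tree, x_tree, it_plain, it_tree


if __name__ == "__main__":
    A, b, tree, x, it_plain, it_tree = demo()
    print("plain CG iterations:", it_plain)
    print("tree-preconditioned CG iterations:", it_tree)
```

A tree-structured matrix can be factored with no fill-in by eliminating leaves first, which is what makes this kind of preconditioner cheap to apply; the dense factorization above ignores that and is used only to keep the sketch short.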

    An efficient graph sparsification approach to scalable harmonic balance (HB) analysis of strongly nonlinear RF circuits

    In the past decades, harmonic balance (HB) has been widely used for computing steady-state solutions of nonlinear radio-frequency (RF) and microwave circuits. However, using HB to simulate strongly nonlinear RF circuits remains a very challenging task. Although direct solution methods can be adopted to handle moderate to strong nonlinearities in HB analysis, such methods do not scale efficiently to large problems due to excessively long simulation time and huge memory consumption. In this work, we present a novel graph sparsification approach for generating preconditioners that can be efficiently applied to simulating strongly nonlinear RF circuits. Our approach first sparsifies the RF circuit matrices, which can subsequently be leveraged for sparsifying the entire HB Jacobian matrix. We show that the resultant sparsified Jacobian matrix can be used as a robust yet efficient preconditioner in HB analysis. Our experimental results show that, compared with existing state-of-the-art direct solvers, the proposed HB solver can more efficiently handle moderate to strong nonlinearities during the HB analysis of RF circuits, achieving more than 10X speedups and 8X memory reductions. © 2013 IEEE